Abstract: This paper presents the importance of system design using FPGAs in engineering education, so that the FPGA can be used as a commercial device to deliver a product. The case is made by comparing various technologies and tools with respect to the area, speed and power consumption of commercially available high-capacity FPGAs. Depending on the requirements of field applications, FPGAs are suitable for meeting the Time to Market (TTM) of a product because they avoid, to some extent, the role of the process industries.



Keywords: Full custom, FPGAs, Latches, Power, Area, Speed, K-Maps, CPLDs

I. Introduction

The recent developing trends in VLSI technology are full custom design, ASICs, FPGAs, PLDs, masking and non-masking of PLDs, and the top-down and bottom-up approaches of semicustom design. The architecture and implementation methods of Field-Programmable Gate Arrays (FPGAs) are discussed here in detail.

The types of IC technologies and implementation methods are mainly derived into the following methods, shown in Figure 1.



Figure: 1 Types of IC-Technologies [diagram: IC technology implementation divides into pre-diffused (gate arrays) and pre-wired (FPGAs)]
Full Custom Design: The full custom design flow of VLSI technology is based on sub-micron level technology and consists of the following steps.

A. Behavioral level / architectural level exploration and simulation using HDLs such as VHDL, Verilog, Verilog-A and ABEL, following the Conceive, Design, Implement and Operate (CDIO) mechanisms.

B. Transistor schematic design to implement the Boolean function F of the inputs A, B, C, ... The pull-up (PMOS) network F-UP and the pull-down (NMOS) network F-DN are obtained from the formulas below:

$$F_{DN} = \overline{F(A, B, C, \dots)} \qquad \text{(Equation-1)}$$

$$F_{UP} = F(\overline{A}, \overline{B}, \overline{C}, \dots) \qquad \text{(Equation-2)}$$

Equation-2 means F rewritten in terms of the complemented inputs, since a PMOS transistor conducts when its gate input is low.

De Morgan's theorems are used extensively with these formulas to obtain the schematic design. The schematic simulations (adjusting the transistor sizing between the PMOS pull-up network and the NMOS pull-down network, calculating the Id values, and so on) have to match the behavioral-level simulations, which is achieved by iterating. The flow then continues with the layout design described in part C.
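As a worked illustration of Equation-1 and Equation-2 (an example chosen here for clarity, not one taken from the original figures), consider the two-input NAND function:

```latex
\begin{align*}
F      &= \overline{A \cdot B} \\
F_{DN} &= \overline{F} = A \cdot B
       && \text{(two NMOS transistors in series)} \\
F_{UP} &= F = \overline{A} + \overline{B}
       && \text{(De Morgan; two PMOS transistors in parallel)}
\end{align*}
```

The AND in the pull-down expression maps to a series connection and the OR in the pull-up expression maps to a parallel connection, so the two networks are duals of each other.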

C. In the layout design, lambda-based or sub-micron rules (micron-based design) must be followed to implement the layout using the standard layers (here the process rules of the four-layer technology of AMS, Austria Micro Systems, are used). Manhattan geometry and Euler-path techniques are used to develop the layout so as to obtain optimum values of chip area, power and speed, and low cost for bulk production. Layout versus schematic (LVS) is then compared; if the two are equal (number of devices, number of interconnects), the layout is inserted into the pad frame (chip assembly), which provides Electro Static Discharge (ESD) protection, I/O buffers for the input and output connections, and analog protected input/output frames (APRIOP) for circuit protection. The design is then converted to the GDS-II file format (tape-out) for the fabrication industry, after inserting the CAP NET on the total layout design in the pad frame.



Figure: 2 Flow of Full Custom Design

This process gives more efficient results in terms of area, low power and high speed, because pure handcrafting techniques are used. (The Intel 4004, the world's first single-chip microprocessor, was developed by full custom design in 1971; today full custom design is applied mainly to supercomputers, especially to timing modules such as PLLs, DLLs and A/D and D/A converters, where the clock recovery mechanisms are most important.) However, a more skilled designer is needed and the time to reach the market is very long (months to a year), although the approach suits bulk/mass production at low cost. It is also too tough for developing countries that do not have process industries: obtaining the fabricated chip consumes more time, so the developed product may not reach the market in time for launch. If any error arises while designing the product or at the time of processing, it may lead to a loss of time and cost, and rectifying the design and finding faults at each level of the design stage takes still more time.



The Cadence ICFB tools are used. They consist of Verilog-A, Virtuoso Schematic Composer for transistor-level schematic capture, the Analog environment with the Spectre simulator, Virtuoso Layout for layout using the LSW window, and Diva for design rule checking (DRC).

DRC: Checks the layout for design (sub-micron) rule violations.

Extract: Creates an extracted view of the layout. This view is used for simulations.

Markers: Explain: click on a marker to find out which design rule was violated. Remove all the markers after a DRC run.

Semicustom Designs: Application Specific Integrated Circuit (ASIC) design flow. In this design process some of the latest techniques can also be incorporated for layout design, such as Physically Knowledgeable Synthesis (PKS), NC-Launch and SoC Encounter.

The common design flow for semicustom designs, ASIC and FPGA, continues as follows.



The Requirements and Necessities Lead the Technology towards Poor Man's ASICs



Programmable Logic Designs: Programmable Logic Devices (PLDs) come mostly in three important construction models:

1) Simple Programmable Logic Devices (SPLDs): PROMs, PLAs and PALs, shown in Figures 4 to 6
2) Complex Programmable Logic Devices (CPLDs), shown in Figure 8
3) Field Programmable Gate Arrays (FPGAs), shown in Figure 14



An ad hoc approach to laying out a regularly structured logic design was adopted, called the gate array structure. Predictability over the logic becomes possible, and area and performance then improve by reducing the design to a two-level layout, i.e. the logic shifts the layout into the SOP/POS fashion of prediffused/mask-programmable arrays.

Batches of wafers containing arrays of primitive cells (or transistors) are manufactured by the vendors and stored, with all fabrication steps standardised and executed without regard to the final application.

A. A layer of gates implements the AND operations (products).

B. A layer of gates implements the OR operations (sums).

These are sparingly used in today's semicustom logic design.

PROM structure: The PROM structure consists of a fixed AND array and a programmable OR array.

Figure: 4 Programmable Read Only Memory (PROM) [diagram: Input, Fixed AND Array, Programmable OR Array, Output]




PAL structure: The Programmable Array Logic structure consists of a programmable AND array and a fixed OR array.

Figure: 6 Programmable Logic Array (PLA)

Demerits of SPLDs: Pre-routed channels are power hungry. If a prediffused cell is built for 4 inputs, using it for a 2-input function wastes inputs. The multiple alternative cells create geometry / oxide isolation overhead. The "dogbone" terminations and longer finger sizes on the layout increase the gate polysilicon resistance (there is no option for free hand-crafting techniques such as folding the fingers of the gate wire). Among these models, whichever is more programmable tends to give the optimum logic and good performance, along with the fewest don't-care states. For example, a PROM will have more don't-care states than a PLA, because in a PLA both the AND and the OR arrays are programmable.



Figure: 7 Sea-of-gates cell using oxide isolation between gates, with dogbone terminations




Here the masking (non-programmable array) and non-masking (programmable array) concepts need to be understood well in order to use these skills in system designs at field-level implementation. The pre-masked part gives optimum results from the power, area and speed point of view, which is why users are never allowed to change or program those modules/arrays.



CPLDs: CPLDs consist of PAL-like blocks connected by a programmable interconnect matrix, with input/output pins on the S/CPLD.

These PLD structures lead to the concept of a preprocessed die that can be programmed at field level, which acts as a poor man's ASIC, i.e. the FPGA (without the help of a fab centre).

Figure: 8 A generic structure of CPLDs

The field programmable gate array design flow consists of the following steps. It reduces the design cost because the software is low cost and not very complex for designers, the TTM is fast, and the flow is executed at field level.



Figure: 9 FPGAs Design & Implementation Flow [flow steps recoverable from the diagram: HDL modelling; placing & routing; post-layout simulation (back annotation); downloading the design to the FPGA]

Here I would like to explore some internal contents and the steps to execute the design for FPGA implementations: writing pure Verilog code and obtaining the simulations, where the design module must be written in a synthesizable style. With the help of technology mapping, the Configurable Logic Blocks (CLBs) are then placed and routed in an optimistic manner to obtain low power and area by avoiding the longest interconnects.

Figure: 10 Classification of FPGAs based on memories [labels recoverable from the diagram: Process Technologies & Memory in FPGAs; Bipolar Process Technology: fuse-programmed (blown); Unipolar/CMOS Process Technology: EPROM, EEPROM, FLASH, SRAM; volatile and non-volatile nature; ISP (In-System Programmable); anti-fuse, e.g. PLICE, ViaLink/MicroVia]

Figure: 11 Reliability of the FPGAs



Figure: 12 Vendors for FPGAs [table; recoverable column headings: Type of FPGAs, Vendor's Name, Web Links, Approximate; recoverable entries: Volatile, Xilinx, Inc., Flash-based FPGAs]

FPGA Structures

Fine-grained Architectures

Xilinx: The highly reconfigurable nature gives lower performance in area, delay time and power. Example: the Look-Up Tables (LUTs) in Xilinx devices. Xilinx has become a major vendor in the market.

Figure: 13 Market Survey for FPGAs



Coarse-grained Architectures

Example: the Dynamic Precision Scaling (DPS) blocks in Altera devices. The less reconfigurable nature, due to the fixed (masked) position, gives high performance in area, delay time and power, because the allocated path is fixed and cannot be reconfigured in any case. Fixing such blocks at the time of layout/ASIC design and reserving the rights over them for the vendor's use is what is called masking.

The FPGA architecture, shown in the figure below, consists of CLBs, I/O blocks and block RAMs.



Figure: 14 FPGAs Structure and Architecture [label recoverable from the diagram: Configurable Logic Blocks]



Think about the logic modules that can be placed near one another rather than placed randomly over the whole FPGA area. Here there are no physical layout designs to manipulate for the design.

The design continues with the following steps: simulation, synthesis, optimistic design by logic (HDL code), and place and route. The user constraint file contains the details of the ports of the design module and of the physical ports of the FPGA board that are used to communicate between the system and the FPGA board. The designed module is loaded in the form of a bit-stream, which designers can reconfigure by themselves at field level.



The manufacturing cycle for an ASIC is very costly and lengthy, and engages lots of manpower. Mistakes not detected at design time have a large impact on development time and cost.

FPGAs are perfect for rapid prototyping of digital circuits, for easy upgrades as in the case of software, for unique applications, and for reconfigurable computing. Comparisons between ASICs and FPGAs are given below.

Figure: 16 Power comparisons for Xilinx Spartan 3 to 6 FPGAs

The major FPGA vendors in the market for SRAM-based FPGAs are Xilinx, Inc., Altera Corp., Atmel and Lattice Semiconductor; refer to the figure for market shares.



Figure: 17 Comparisons between ASICs and FPGAs [table comparing ASIC and FPGA on: efficiency; performance (delay time, low power and area); NRE cost; unit cost; TTM and rapid prototyping]



Flash & antifuse FPGAs: Actel, QuickLogic Corp. Primary products: FPGAs and the associated CAD software are listed below for Xilinx, Inc., because it is the vendor with the highest market share. Its main headquarters are in San Jose, CA. It is a fabless semiconductor company, and the earliest of its design software used here are the Alliance and Foundation Series.

Figure: 19 The new Road Map for IGLOO-Nano FPGAs

Figure: 20 Spartan3A with Nomenclature [labels recoverable from the figure: Spartan 3 device XC3S1500; 1500K = 1.5M equivalent logic gates; speed grade -4; package type; 320]

Xilinx offers the Virtex family for high performance; Spartan is a low-cost family, of which we have used the Spartan-3A/3E and the Spartan-6/7. Architectural details are described below for the Spartan-3A, with its nomenclature.

Figure: 21 Internal Structure of the Spartan3A




Figure: 22 Structure of each Configurable L-Block [diagram: four slices, each containing two logic cells]

Each slice contains two sets of the following:



Four-input LUT: any 4-input logic function, or a 16-bit x 1 synchronous RAM (SLICEM only), or a 16-bit shift register (SLICEM only)
Carry & Control: fast arithmetic logic, multiplier logic, multiplexer logic
Storage element: latch or flip-flop; set and reset; true or inverted inputs; synchronous or asynchronous control

After design synthesis, the Map report gives the FPGA hardware and software details and the design date, the number of errors and warnings, and the logic utilization, such as the number of slice flip-flops, the number of 4-input LUTs, the number of used LUTs and the number of LUTs used for route-through. The post-layout, timing and P&R reports give the resource utilization with design statistics.




Figure: 23 Internal Structure of Configurable L-Block

LUTs: Look-Up Tables are the primary elements for logic implementation. Each LUT can implement any function of 4 inputs; an example is shown below.

Figure: 24 Internal Structure of Configurable L-Block
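A 4-input LUT can be pictured as a 16-entry, 1-bit memory addressed by the four inputs. The sketch below is a behavioral model only, not a vendor primitive; the module name `lut4_model`, the port names and the parameter `INIT` are assumptions made for illustration.

```verilog
// Behavioral model of a 4-input look-up table (illustrative, not a vendor primitive).
// INIT holds the truth table: bit i is the output for the input value i = {i3,i2,i1,i0}.
module lut4_model #(
    parameter [15:0] INIT = 16'h8000    // default truth table: 4-input AND
) (
    input  wire i0,
    input  wire i1,
    input  wire i2,
    input  wire i3,
    output wire o
);
    wire [15:0] table_bits = INIT;          // stored truth table
    wire [3:0]  addr       = {i3, i2, i1, i0}; // the inputs form the table address

    assign o = table_bits[addr];            // read the addressed truth-table bit
endmodule
```

Overriding `INIT` with `16'h6996` would turn the same structure into a 4-input XOR, since bit i of that constant is 1 exactly when i contains an odd number of ones. Only the stored bits change, not the hardware, which is why one LUT can realise any function of 4 inputs.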



LUTs follow Rent's rule, which pertains to the organization of computing logic, specifically the relationship between the number of external signal connections to a logic block (i.e., the number of "pins") and the number of logic gates in the logic block, and which has been applied to circuits ranging from small digital circuits to mainframe computers.

5-Input Functions implemented using two LUTs: one CLB slice can implement any function of 5 inputs.

• The logic function is partitioned between two LUTs.

• The F5 multiplexer selects the LUT.

Figure: 25 Internal Structure of each Configurable L-Block
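The partitioning described above amounts to splitting the function on its fifth input: one LUT holds the function for that input at 0, the other holds it for that input at 1, and the F5 multiplexer picks between them. A minimal behavioral sketch follows. The module name, port names and the two table parameters are assumptions for illustration, and the default tables happen to realise a 5-input XOR.

```verilog
// Behavioral sketch of a 5-input function built from two 4-input tables and a mux.
module func5_two_luts #(
    parameter [15:0] INIT_LO = 16'h6996, // table used when e = 0 (a^b^c^d)
    parameter [15:0] INIT_HI = 16'h9669  // table used when e = 1 (complement of INIT_LO)
) (
    input  wire a,
    input  wire b,
    input  wire c,
    input  wire d,   // a..d are shared by both tables
    input  wire e,   // fifth input, drives the F5 multiplexer select
    output wire f
);
    wire [15:0] table_lo = INIT_LO;
    wire [15:0] table_hi = INIT_HI;
    wire [3:0]  addr     = {d, c, b, a};

    wire lut_lo = table_lo[addr];   // value of the function when e = 0
    wire lut_hi = table_hi[addr];   // value of the function when e = 1

    assign f = e ? lut_hi : lut_lo; // F5 multiplexer
endmodule
```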




Figure: 26 Internal Structure of I/O-Block of Spartan 3

Care is needed when writing the HDL for any design. While writing the HDL code:

1) The case statement should end with a default branch; otherwise synthesis may generate and use one extra latch, which consumes more power. For example, for a 2-input binary count/select:

case 00,

case 01,

case 10,

case 11,

we need to mention case 00 again in the last stage (as the default); otherwise a latch may be generated, which we can observe in the synthesis.
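A minimal sketch of the first guideline follows; the module and signal names are assumptions for illustration. The `default` branch guarantees that `y` is assigned on every path through the block, so the block describes purely combinational logic and no storage element is needed.

```verilog
// 4-to-1 multiplexer written with a complete case statement.
module mux4_default (
    input  wire [1:0] sel,
    input  wire [3:0] d,
    output reg        y
);
    always @(*) begin
        case (sel)
            2'b00:   y = d[0];
            2'b01:   y = d[1];
            2'b10:   y = d[2];
            2'b11:   y = d[3];
            default: y = d[0];  // repeats the 2'b00 choice as the fallback
        endcase
    end
endmodule
```

If one of the branches were dropped together with the `default`, `y` would be left unassigned for that select value and would have to hold its previous value, which is the situation in which synthesis tools infer a latch.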

Observe the following code.

// Model of a flip-flop with asynchronous reset
always @(posedge clock)
    q <= d;

always @(reset)
    if (reset)
        assign q = 1'b0;
    else
        deassign q; /* if the deassign q is not mentioned here, a latch would be generated along with the flip-flop in the synthesis */
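For comparison, a commonly used alternative (not the style used in the listing above) places the reset in the sensitivity list of a single always block, which avoids procedural assign/deassign altogether. The module name `dff_async_reset` and its port names are assumptions for illustration.

```verilog
// D flip-flop with an active-high asynchronous reset, single always block.
module dff_async_reset (
    input  wire clock,
    input  wire reset,  // asynchronous, active high
    input  wire d,
    output reg  q
);
    always @(posedge clock or posedge reset) begin
        if (reset)
            q <= 1'b0;  // takes effect as soon as reset rises, without a clock edge
        else
            q <= d;     // normal operation: capture d on the rising clock edge
    end
endmodule
```

Because every path through the block assigns `q`, there is no branch in which an extra latch could be inferred.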



2) For the 2-variable K-map approach, i.e. the order 00, 01, 11, 10, use the Gray code counting method; likewise, use the same method in mux-based select/counter mechanisms. The binary count 00, 01, 10, 11 consumes 6 bit transitions per full cycle, whereas the Gray method takes only 4, which directly impacts the dynamic power consumption. The dynamic power is $P_{dyn} = \tfrac{1}{2} C V^2 f_{0 \to 1}$, so the binary counter draws about 6/4 = 1.5 times the switching power of the Gray code mechanism. Here the power consumption effects are due to the switching activity $f_{0 \to 1}$, which depends on the clock frequency; the same applies to the CMOS switching activity of primitive gates.
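The 2-bit Gray sequence of the second guideline can be sketched as a small state register; the module and port names are assumptions for illustration. Counting bit changes over one full cycle gives 1 + 1 + 1 + 1 = 4 for the Gray order 00, 01, 11, 10, against 1 + 2 + 1 + 2 = 6 for the binary order 00, 01, 10, 11.

```verilog
// 2-bit Gray-code counter: exactly one bit changes on every clock edge.
module gray_counter2 (
    input  wire       clock,
    input  wire       reset,   // asynchronous, active high
    output reg  [1:0] q
);
    always @(posedge clock or posedge reset) begin
        if (reset)
            q <= 2'b00;
        else begin
            case (q)
                2'b00:   q <= 2'b01;
                2'b01:   q <= 2'b11;
                2'b11:   q <= 2'b10;
                2'b10:   q <= 2'b00;
                default: q <= 2'b00;  // recover from unknown states
            endcase
        end
    end
endmodule
```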

3) For mission-critical applications, stick to the if / else-if / else coding method rather than case/wait statements. For FSMs in reliability- or mission-critical applications, the Moore model is better than the Mealy machine for avoiding catastrophe, even though the Mealy machine is faster than the Moore machine. If power consumption is the major constraint, use the Mealy machine for field-level implementations.

4) Implementation of the design: view the placed and routed design in FPGA Editor, and set up multiple place-and-route runs for the design.

Steps to download onto FPGA



Create a programming file (.bit) to program your FPGA.

Generate a PROM, ACE, or JTAG file for debugging or to download to the device. Use iMPACT to program the device with a programming cable.

Generate Programming File

Bitstream (.bit) for the FPGA

PROM image file (.mcs) for non-volatile memory

Configure Device

Use a JTAG download cable. Load the bitstream directly onto the FPGA, or load the PROM image into non-volatile memory using a PROM serial or parallel interface (Xilinx or 3rd-party solutions).

Figure: 28 JTAG Cable specifications [label recoverable from the figure: Xilinx cable]

Reconfigurability: FPGA devices can be re-configured to change their logic function while resident in the system. Design updates or modifications are easy, and can be made to products already in the field. An FPGA can even be reconfigured dynamically to perform different functions at different times.
Number of Bits to Program a Spartan-3 Generation FPGA and Smallest Platform Flash PROM

Family      | FPGA     | Number of Configuration Bits | Smallest Possible Platform Flash PROM
Spartan-3A  | XC3S400A | 1,886,560                    | XCF02S
Spartan-3AN | XC3S700A | 2,732,640                    | XCF02S

Figure: 29 Internal Structure of Configurable L-Block

JTAG Interface: Spartan-3 Generation FPGAs and the Platform Flash PROMs both have a four-wire IEEE 1149.1/1532 JTAG port. The FPGA and the PROM share the JTAG TCK clock input and the TMS mode select input. The devices may be connected in either order on the JTAG chain, with the TDO output of one device feeding the TDI input of the following device in the chain. The TDO output of the last device in the JTAG chain drives the JTAG connector.

Set the FPGA board to its default values as shown below on the board.

[Board setup figure; recoverable labels: Check jumper settings; (Optional) Connect VGA display; (5) Connect AC wall adapter; (6) Turn on power switch]



Conclusions

Xilinx FPGAs are more suitable for implementing engineering applications because of their flexibility and market share.

Some notes about FPGA design: it can be carried out with both top-down and bottom-up approaches. Taking the design module, writing the HDL, simulating it by applying test vectors and checking the simulation results against the specifications can be called the top-down approach; here the design is implemented by logic equations, K-maps, truth tables, etc.

Synthesizing the same design using the UCF and place and route, generating the .bit file and loading it onto the FPGA to implement it on the board is called the bottom-up approach. Iterative methods can be applied for better results (area, speed, power) by verifying the logic, K-maps and truth tables. Those obtained from the synthesis and delivered by the tool would be the same as designed in the top-down step, but the logic implementation would be changed and optimized using Shannon's principle, so that all logic is realised as mux-based implementations.



4) Shannon's principle

A Boolean function f(w1, w2, ..., wn) can be written in the format

$$f(w_1, w_2, \dots, w_n) = \overline{w_1} \cdot f(0, w_2, \dots, w_n) + w_1 \cdot f(1, w_2, \dots, w_n)$$



1) Example: Three-input XOR implemented with a 2-to-1 Mux

Figure: 34 Three-input XOR

2) Optimized circuit for the three-input XOR gate implemented with a 4-to-1 Mux

w1 w2 w3 | f
0  0  0  | 0
0  0  1  | 1
0  1  0  | 1
0  1  1  | 0
1  0  0  | 1
1  0  1  | 0
1  1  0  | 0
1  1  1  | 1

Figure: 32 Internal Structure of Configurable L-Block
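Applying Shannon's expansion to f = w1 XOR w2 XOR w3 with w1 as the select gives f = w1'.(w2 XOR w3) + w1.(w2 XNOR w3), i.e. a 2-to-1 multiplexer. Taking w1 and w2 together as the selects gives a 4-to-1 multiplexer whose data inputs are only w3 and w3'. The sketch below shows both forms side by side; the module and port names are assumptions for illustration.

```verilog
// Three-input XOR built from multiplexers via Shannon's expansion.
module xor3_mux (
    input  wire w1,
    input  wire w2,
    input  wire w3,
    output wire f_mux2,  // 2-to-1 mux form, select = w1
    output wire f_mux4   // 4-to-1 mux form, select = {w1, w2}
);
    // Cofactors with respect to w1
    wire cof0 = w2 ^ w3;      // f(0, w2, w3)
    wire cof1 = ~(w2 ^ w3);   // f(1, w2, w3)
    assign f_mux2 = w1 ? cof1 : cof0;

    // Cofactors with respect to w1 and w2 reduce to w3 or its complement
    reg y;
    always @(*) begin
        case ({w1, w2})
            2'b00:   y = w3;
            2'b01:   y = ~w3;
            2'b10:   y = ~w3;
            2'b11:   y = w3;
            default: y = w3;
        endcase
    end
    assign f_mux4 = y;
endmodule
```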



Using the FPGA in this manner, the design is synthesized, occupies very little area in the layout and maps to the RTL; the technology file can be checked after the synthesis.

For ease of understanding, the NAND gate is taken as an example.



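HDL for NAND Gate
module NANDgate(A, B, F);
    input  [0:0] A;
    input  [0:0] B;
    output [0:0] F;
    reg    [0:0] F;

    // The process starts: re-evaluate F whenever A or B changes
    always @(A or B)
    begin
        F = ~(A & B); // blocking assignment, as is usual for combinational logic
    end

endmodule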
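A simulation of the NAND gate can be driven by a small testbench that applies all four input combinations. The sketch below is an assumption about how such test vectors might be applied, not the authors' testbench. The name `NANDgate_tb` is hypothetical, and a compact copy of the gate is repeated so that the listing stands on its own.

```verilog
// Compact copy of the gate under test.
module NANDgate(A, B, F);
    input  A, B;
    output F;
    assign F = ~(A & B);
endmodule

// Hypothetical testbench: applies 00, 01, 10, 11 and prints the response.
module NANDgate_tb;
    reg  a, b;
    wire f;
    integer i;

    NANDgate dut (.A(a), .B(b), .F(f));

    initial begin
        for (i = 0; i < 4; i = i + 1) begin
            {a, b} = i[1:0];   // next input combination
            #10;               // let the output settle
            $display("A=%b B=%b F=%b", a, b, f);
        end
        $finish;
    end
endmodule
```

The printed rows form the truth table of the gate, which can be compared against the values reported by the tool after synthesis.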




Figure: 33 Simulation results for NAND gate

The results would look like this, but the synthesized design never comes out as the usual gate; it changes to an equivalent circuit.

Figure: 34 NAND GATE

After synthesis we obtained the values for the truth table and the Karnaugh map.